Fix in-memory initializer handling for non-CPU device#24978
Fix in-memory initializer handling for non-CPU device#24978fs-eire wants to merge 1 commit intomicrosoft:mainfrom
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR addresses an issue where in-memory initializers assigned to non-CPU devices were not correctly copied to their target device. The changes add an extra step to check the device type and, if needed, allocate a new tensor on the non-CPU device and perform a copy of the initializer.
- Added a conditional branch to check if the device is non-CPU and perform tensor copying.
- Incorporated error handling for string tensors during the copy operation.
- Maintained backward compatibility by preserving the original initializer handling logic when on CPU.
Comments suppressed due to low confidence (1)
onnxruntime/core/framework/session_state_utils.cc:415
- [nitpick] Consider adding a comment explaining why initializers assigned to non-CPU devices require an explicit copy from CPU memory, to help future maintainers understand the rationale behind this conditional branch.
if (device_type != OrtDevice::CPU) {
| // if the initializer is on a non-CPU device, copy it from CPU to the device. | ||
| const auto& initializer_tensor = ort_value.Get<Tensor>(); | ||
| if (initializer_tensor.GetElementType() == ONNX_NAMESPACE::TensorProto_DataType_STRING) { | ||
| return ORT_MAKE_STATUS(ONNXRUNTIME, FAIL, "string tensor is not supported for copying between allocators"); |
There was a problem hiding this comment.
The error message for string tensors ('string tensor is not supported for copying between allocators') could be enhanced with additional context to guide users toward proper usage or handling. Consider providing a more descriptive message or suggesting alternative approaches where applicable.
| return ORT_MAKE_STATUS(ONNXRUNTIME, FAIL, "string tensor is not supported for copying between allocators"); | |
| return ORT_MAKE_STATUS(ONNXRUNTIME, FAIL, | |
| "String tensor copying between allocators is not supported due to limitations in ONNX Runtime. " | |
| "Consider preprocessing string tensors on the CPU or avoiding operations that require copying them between devices."); |
|
Add a test case? |
|
The original issue can be fixed by #23979. As suggested by @skottmckay we can wait for it to be merged first. |
|
Close this PR. The problem is fixed by #23979. |
Description
In-memory initializers are currently not handled correctly for non-CPU devices. If the memory planner assigned the initializer to a non-CPU device, the initializer is not copied to the device during the initialization steps.
This PR fixes this problem by adding an extra step to copy the initializer to the device if needed.
Motivation and Context
Fixes #24768